23 research outputs found

    Managing polyglot systems metadata with hypergraphs

    A single type of data store can hardly fulfill every end-user requirement in the NoSQL world. Therefore, polyglot systems use different types of NoSQL datastores in combination. However, the heterogeneity of the data storage models makes managing the metadata a complex task in such systems, and only a handful of studies have addressed it. In this paper, we propose a hypergraph-based approach for representing the catalog of metadata in a polyglot system. Taking an existing common programming interface to NoSQL systems, we extend and formalize it as hypergraphs for managing metadata. Then, we define design constraints and query transformation rules for three representative data store types. Furthermore, we propose a simple query rewriting algorithm using the catalog itself for these data store types and provide a prototype implementation. Finally, we show the feasibility of our approach on a use case of an existing polyglot system.

    Pathobiome driven gut inflammation in Pakistani children with environmental enteric dysfunction

    Environmental Enteric Dysfunction (EED) is an acquired small intestinal inflammatory condition underlying high rates of stunting in children <5 years of age in low- and middle-income countries. Children with EED are known to have repeated exposures to enteropathogens and environmental toxins that lead to malabsorptive syndrome. We aimed to characterize the association of linear growth faltering with enteropathogen burden and subsequent changes in EED biomarkers. In a longitudinal birth cohort (n = 272), monthly anthropometric measurements (Length for Age Z score, LAZ) of asymptomatic children were obtained up to 18 months. Biological samples were collected at 6 and 9 months for the assessment of biomarkers. A customized TaqMan array card was used to target 40 enteropathogens in fecal samples. Linear regression was applied to study the effect of specific enteropathogen infection on change in linear growth (ΔLAZ). Presence of any pathogen in fecal sample correlated with serum flagellin IgA (6 mo, r = 0.19, p = 0.002), fecal Reg 1b (6 mo, r = 0.16, p = 0.01; 9 mo, r = 0.16, p = 0.008) and serum Reg 1b (6 mo, r = 0.26, p < 0.0001; 9 mo, r = 0.15, p = 0.008). At 6 months, presence of Campylobacter [β (SE) 7751.2 (2608.5), p = 0.003] and ETEC LT [β (SE) 7089.2 (3015.04), p = 0.019] was associated with an increase in MPO. Giardia was associated with an increase in Reg1b [β (SE) 72.189 (26.394), p = 0.006] and anti-flic IgA [β (SE) 0.054 (0.021), p = 0.0091]. Multiple enteropathogen infections in early life negatively correlated with ΔLAZ, with simultaneous changes in gut inflammatory and permeability markers. A combination vaccine targeting enteropathogens in early life could help in the prevention of future stunting.

    Serum anti-flagellin and anti-lipopolysaccharide immunoglobulins as predictors of linear growth faltering in Pakistani infants at risk for environmental enteric dysfunction

    Background: Environmental Enteric Dysfunction (EED) in children from low-income countries has been linked to linear growth declines. There is a critical need to identify sensitive and early EED biomarkers. Objective: Determine whether levels of antibodies against bacterial components flagellin (flic) and lipopolysaccharide (LPS) predict poor growth. Design/Methods: In a prospective birth cohort of 380 children in rural Pakistan, blood and stool samples were obtained at ages 6 and 9 months. Linear mixed effects models were used to examine longitudinal associations between quartiles of anti-flic and anti-LPS antibodies and changes in LAZ, WAZ and WLZ scores. Spearman's correlations were measured between anti-flic and anti-LPS immunoglobulins and measures of systemic/enteric inflammation and intestinal regeneration. Results: Anti-LPS IgA correlated significantly with CRP, AGP and Reg1 serum at 6 mo and with MPO at 9 mo. In multivariate analysis at 6 mo of age, higher anti-LPS IgA levels predicted greater declines in LAZ scores over the subsequent 18 mo (comparing highest to lowest quartile, β (SE) change in LAZ score/year = -0.313 (0.125), p-value = 0.013). Children with anti-flic IgA in the two highest quartiles measured at 9 mo of age had declines in LAZ of -0.269 (0.126), p = 0.033, and -0.306 (0.129), p = 0.018, respectively, during the subsequent 18 mo of life, compared to those in the lowest quartile of anti-flic IgA. Conclusions and Relevance: Elevated anti-flic IgA and anti-LPS IgA antibodies at 6 and 9 mo predict declines in linear growth. Systemic and enteric inflammation correlated with anti-LPS IgA provides mechanistic considerations for potential future interventions.

    New genetic loci link adipose and insulin biology to body fat distribution.

    Body fat distribution is a heritable trait and a well-established predictor of adverse metabolic outcomes, independent of overall adiposity. To increase our understanding of the genetic basis of body fat distribution and its molecular links to cardiometabolic traits, here we conduct genome-wide association meta-analyses of traits related to waist and hip circumferences in up to 224,459 individuals. We identify 49 loci (33 new) associated with waist-to-hip ratio adjusted for body mass index (BMI), and an additional 19 loci newly associated with related waist and hip circumference measures (P < 5 × 10⁻⁸). In total, 20 of the 49 waist-to-hip ratio adjusted for BMI loci show significant sexual dimorphism, 19 of which display a stronger effect in women. The identified loci were enriched for genes expressed in adipose tissue and for putative regulatory elements in adipocytes. Pathway analyses implicated adipogenesis, angiogenesis, transcriptional regulation and insulin resistance as processes affecting fat distribution, providing insight into potential pathophysiological mechanisms.

    Performance Prediction for Concurrent Database Workloads

    Current trends in data management systems, such as cloud and multi-tenant databases, are leading to data processing environments that concurrently execute heterogeneous query workloads. At the same time, these systems need to satisfy diverse performance expectations. In these newly-emerging settings, avoiding potential Quality-of-Service (QoS) violations heavily relies on performance predictability, i.e., the ability to estimate the impact of concurrent query execution on the performance of individual queries in a continuously evolving workload. This paper presents a modeling approach to estimate the impact of concurrency on query performance for analytical workloads. Our solution relies on the analysis of query behavior in isolation, pairwise query interactions and sampling techniques to predict resource contention under various query mixes and concurrency levels. We introduce a simple yet powerful metric that accurately captures the joint effects of disk and memory contention on query performance in a single value. We also discuss predicting the execution behavior of a time-varying query workload through query-interaction timelines, i.e., a fine-grained estimation of the time segments during which discrete mixes will be executed concurrently. Our experimental evaluation on top of PostgreSQL/TPC-H demonstrates that our models can provide query latency predictions within approximately 20% of the actual values in the average case.

    Packing Light: Portable Workload Performance Prediction for the Cloud

    We introduce a new learning-based solution for portable database workload performance prediction. The current state of the art addresses performance prediction for individual, static hardware configurations and thus cannot generalize to new platforms without additional training. In this work, we focus on analytical databases that might be deployed on different hardware configurations, possibly offered by various Infrastructure-as-a-Service (IaaS) providers in the cloud. Enabling workload performance predictions that can be ported across hardware configurations and IaaS offerings could significantly help cloud users with their service-purchase decisions and cloud providers with their provisioning decisions. Our solution is based on collaborative filtering modeling and prediction. We applied it to lightweight workload fingerprints that model the characteristics and behavior of concurrent query workloads for carefully selected, abstract hardware configurations. Our preliminary results are derived from experiments with TPC-H and TPC-DS benchmarks on the Amazon and Rackspace clouds. They demonstrate that our techniques can predict analytical workload throughput values for diverse hardware platforms with low training overhead and within approximately 30% of the correct figure.

    The BigDAWG Polystore System and Architecture

    © 2016 IEEE. Organizations are often faced with the challenge of providing data management solutions for large, heterogeneous datasets that may have different underlying data and programming models. For example, a medical dataset may have unstructured text, relational data, time series waveforms and imagery. Trying to fit such datasets in a single data management system can have adverse performance and efficiency effects. As a part of the Intel Science and Technology Center on Big Data, we are developing a polystore system designed for such problems. BigDAWG (short for the Big Data Analytics Working Group) is a polystore system designed to work on complex problems that naturally span across different processing or storage engines. BigDAWG provides an architecture that supports diverse database systems working with different data models, support for the competing notions of location transparency and semantic completeness via islands, and a middleware that provides a uniform multi-island interface. Initial results from a prototype of the BigDAWG system applied to a medical dataset validate polystore concepts. In this article, we will describe polystore databases, the current BigDAWG architecture and its application on the MIMIC II medical dataset, initial performance results and our future development plans.

    A Demonstration of the BigDAWG Polystore System

    This paper presents BigDAWG, a reference implementation of a new architecture for “Big Data” applications. Such applications not only call for large-scale analytics, but also for real-time streaming support, smaller analytics at interactive speeds, data visualization, and cross-storage-system queries. Guided by the principle that “one size does not fit all”, we build on top of a variety of storage engines, each designed for a specialized use case. To illustrate the promise of this approach, we demonstrate its effectiveness on a hospital application using data from an intensive care unit (ICU). This complex application serves the needs of doctors and researchers and provides real-time support for streams of patient data. It showcases novel approaches for querying across multiple storage engines, data visualization, and scalable real-time analytics